ABSTRACT


Automating the production of questions for assessment and self-assessment has recently become an active field of study. 
The use of Semantic Web technologies has certain advantages over other methods for question generation and is thus one 
of the most important lines of research for this problem. The aim of this paper is to provide an overview of the existing 
research on applying Semantic Web technologies to automatic question generation. The review provides a 
classification based on technological as well as pedagogical aspects of the works presented. 
1. INTRODUCTION 


Automated question generation has become an active field of study during recent years, pertaining to 
methods and software tools that generate assessment and self-assessment content from various sources of 
information, including text, data and knowledge bases, and specialized formal language descriptions. The 
learning content output by this process supports forms of assessment such as multiple-choice questions and 
problem definitions, e.g. word problems in mathematics. 

Question generation is an interdisciplinary field, involving several scientific disciplines, 
including discourse analysis, natural language processing, instructional design, cognitive psychology, 
psychometrics/educational measurement, and artificial intelligence/knowledge representation. The methods 
found in the related literature can be classified into three types. First, methods that rely 
on Natural Language Processing (NLP) techniques use automated analysis of texts for the generation of 
questions (Le et al 2014), although they can also use other sources such as WordNet and DBpedia. Second, 
template-based methods (Al-Yahya 2014) use certain patterns/templates for questions, which are filled 
with values drawn from large sets of allowed ones. The most important method in this category is Automatic 
Item Generation (AIG) (Gierl and Haladyna 2012). Finally, semantics-based methods (Alsubait et al 2013) 
use Semantic Web specifications, mainly ontologies, in order to automate assessment generation. 
Semantics-based approaches are considered to have some advantages over approaches based on natural language 
resources (Le et al 2014), and they are the focus of this paper. Very few reviews on question generation exist in the 
literature. A recent review of ontology-based question generation (Rakangor and Ghodasara 2015) does not 
take into account the pedagogical/cognitive aspects of the works under consideration. Another 
existing review (Le et al 2014) focuses on question generation using natural language generation techniques, 
while this review focuses on the Semantic Web and ontologies. 

The following research questions are considered: 1) What are the kinds of questions generated by 
Semantic Web-based automated assessment? 2) What technical methods from knowledge representation and 
the Semantic Web are used for question generation? 3) What kind of knowledge/learning is assessed by the 
generated questions? 
2. CLASSIFYING ONTOLOGY-BASED QUESTION GENERATION 


All the works studied in this review generate question items based on ontologies. Ontologies are the basic 
components of the Semantic Web, providing formal conceptualizations of specific domains. They contain 
axioms that involve classes of things in a domain, certain individuals as well as relations among these 
individuals. We analyzed the above works according to the kinds of axioms that they used for question 
generation, as well as according to the kinds of questions generated. 

Further, in order to identify the kind of knowledge that is assessed by the automatically generated 
assessment items under consideration, we apply the well-known Bloom taxonomy of learning objectives (Anderson 
et al, 2001) in the cognitive domain. This taxonomy proposes six levels of learning, from the most basic to 
the most complex: knowledge, comprehension, application, analysis, evaluation and creation/synthesis. For 
each of these levels, a finer categorization is defined. The taxonomy is meant to drive instructional design, 
so that an assessment item should address one or more objectives from the above categories. 


2.1 Multiple Choice and Closed-Type Questions 


The most common type of automatically generated assessment item is the closed-type question, especially 
multiple-choice and fill-in-the-blank questions. A multiple-choice question (MCQ) is a type of question item in 
which students have to choose the correct answer (key) from a set that also contains incorrect alternatives (distractors). 
The textual question/prompt of an MCQ is called the stem. Based on the underlying ontological axioms, various 
approaches have been proposed for question generation. Tan, Kiu and Lukose (Tan et al 2012) define three 
types of questions, based on the underlying ontological structures: class-based, property-based and 
terminology-based questions. Furthermore, according to Al-Yahya (2014) there are three types of question stem 
generation strategies: class membership, individual and property. In this review we propose a similar 
categorization that identifies the following types of multiple-choice questions: class membership questions, which 
involve relationships among individuals and classes (that is, ontological categories); property-based questions, which 
involve relations among individuals, as well as datatype properties relating individuals to certain values; 
and terminology-based questions, which involve only relationships among classes through hierarchies and 
roles/properties. As will be shown later, this categorization allows us to draw conclusions about the 
kinds of knowledge/learning that are assessed by the analyzed works. 


2.1.1 Class Membership 


In this category, questions are based on the identification of instances of specific classes in the ontology, 
taking into account the hierarchy of concepts. Examples of this type of question are “Which of the 
following items is (or is not) an example of the concept, X?” and “Is optical mouse computer 
function?”. While most approaches use simple patterns/queries in order to identify appropriate questions 
(Holohan et al, 2005; Papasalouros, Kotis and Kanaris, 2011; Zitko et al, 2009; Cubric and Tosic, 2011; 
Bouzekri et al, 2015; Bongir, Attar and Janardhanan, 2018), some more advanced methods have also been 
proposed in the literature. For instance, Vinu and Kumar (2015) generate class membership questions by defining for 
each instance a set of concepts (classes) named node-label-set that contains the set of classes and restrictions 
that are satisfied by the specific instance. An example of such question stem is “Choose a Hogwarts Student, 
a Wizard, a Gryffindor and a Halfblood, having exactly one Owl as Pet.” A common technique in most of the 
above works for generating distractors is based on disjoint classes, that is, classes of things that are not 
allowed to have common elements, such as Animal and Plant. 
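
The disjointness-based distractor technique can be illustrated with a minimal Python sketch. It does not reproduce the implementation of any of the works cited above: the toy ontology, the names INSTANCES and DISJOINT, and the function class_membership_mcq are all hypothetical, and a real system would read class assertions and disjointness axioms from an OWL ontology rather than from dictionaries.

```python
import random

# Toy ontology (hypothetical data, not taken from any cited work).
INSTANCES = {
    "Animal": ["dog", "sparrow", "salmon"],
    "Plant": ["oak", "fern", "tulip"],
}
# Pairs of classes declared disjoint, i.e. not allowed to share instances.
DISJOINT = {("Animal", "Plant"), ("Plant", "Animal")}


def class_membership_mcq(target, n_distractors=3, rng=None):
    """Build one multiple-choice item asking for an instance of `target`."""
    rng = rng or random.Random()
    key = rng.choice(INSTANCES[target])
    # Instances of a class disjoint with the target can never be a correct
    # answer, so they are safe to use as distractors.
    pool = [
        individual
        for cls, members in INSTANCES.items()
        if (target, cls) in DISJOINT
        for individual in members
    ]
    distractors = rng.sample(pool, min(n_distractors, len(pool)))
    options = distractors + [key]
    rng.shuffle(options)
    return {
        "stem": f"Which of the following items is an example of the concept {target}?",
        "options": options,
        "key": key,
    }
```

Calling class_membership_mcq("Animal") returns a stem, one key drawn from Animal, and distractors drawn only from Plant, so no distractor can accidentally be a correct answer.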

This type of question assesses understanding, since, according to the revised Bloom taxonomy, 
classification/subsumption of individuals into categories falls into this level of learning. 


2.1.2 Property-based Questions 


Property-based generated questions involve relations among individuals as well as properties that relate 
individuals to certain values. Examples of the first kind are “Who is Mark married to?” and “Trevor is the 
pet of...”, and they are supported by the majority of the studied works (Papasalouros et al, 2011; Cubric and 
Tosic, 2011; Foulonneau, 2012; Al-Yahya, 2014; Alsubait et al, 2014; Stancheva et al, 2016; Bongir et al, 
2018). An example of the second category (datatype properties) is “Yazeed Althalith ruled for a period of ... 
months.”; such questions are also supported by many of the analyzed works. Again, apart from relatively simple 
querying/filtering methods, Vinu and Kumar (2015) apply an elaborate algorithm for identifying pairs of 
individuals that participate in a relationship, generating an edge-label-set, which is the set of all property 
relations from one instance to the other. 
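
As a rough illustration of the edge-label-set idea, the following Python sketch groups property assertions by ordered pair of individuals. It is a simplified reading of the definition given above and not the algorithm of Vinu and Kumar (2015). The triples, the property names and the individuals Anna and Neville are hypothetical, as are the function names.

```python
from collections import defaultdict

# Hypothetical object-property assertions as (subject, property, object) triples.
TRIPLES = [
    ("Mark", "marriedTo", "Anna"),
    ("Trevor", "petOf", "Neville"),
    ("Trevor", "ownedBy", "Neville"),
]


def edge_label_sets(triples):
    """Map each ordered pair (subject, object) to the set of properties linking them."""
    labels = defaultdict(set)
    for subj, prop, obj in triples:
        labels[(subj, obj)].add(prop)
    return dict(labels)


def fill_in_blank(subj, prop, obj):
    """Turn one assertion into a fill-in-the-blank item whose key is the object."""
    return {"stem": f"{subj} {prop} ...", "key": obj}
```

Here the pair (Trevor, Neville) receives the edge-label-set {petOf, ownedBy}, while each single assertion can still be turned into a simple recall item with fill_in_blank.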

Whether posed as multiple-choice or fill-in-the-blank items, questions of this kind 
require the student to identify or recall certain information, 
so we consider that they assess knowledge, which is the most basic form of learning 
according to the revised Bloom taxonomy. 


2.1.3 Terminology-based Questions and Rules 


Terminology-based questions refer to questions that involve only relationships among classes through 
hierarchies and roles (relations). Examples of this kind of questions are “Where does an instructor work?” 
and “Which category does Asthma belong to?” (Zitko et al, 2009; Al-Yahya 2014; Cubric and Tosic, 2011; 
Papasalouros et al, 2011; Foulonneau, 2012; Alsubait et al, 2014; Lopetegui et al, 2015; Stancheva et al, 
2016; Stasaski and Hearst, 2017). These types of questions do not refer to concrete individuals and are thus 
considered to assess higher levels of knowledge, beyond the mere recall of certain facts. More specifically, 
they involve cognitive processes of classification/subsumption and generalization, since they deal 
only with concepts at various levels of abstraction, as well as with relationships among these concepts. Thus, 
according to the Bloom taxonomy, they assess at the level of understanding. 
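
A terminology-based question of the form “Which category does Asthma belong to?” can be sketched in Python as a walk over the class hierarchy. The subclass axioms below are invented for illustration, the names SUBCLASS_OF, ancestors and category_question are hypothetical, and the sketch assumes a single-parent hierarchy, which real ontologies do not guarantee.

```python
# Hypothetical subclass axioms: child class -> direct parent class.
SUBCLASS_OF = {
    "Asthma": "RespiratoryDisease",
    "Bronchitis": "RespiratoryDisease",
    "RespiratoryDisease": "Disease",
    "Diabetes": "MetabolicDisease",
    "MetabolicDisease": "Disease",
}


def ancestors(cls):
    """Return all superclasses of `cls`, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def category_question(cls):
    """Ask for the direct superclass of `cls` and collect candidate distractors."""
    key = SUBCLASS_OF[cls]
    all_classes = set(SUBCLASS_OF) | set(SUBCLASS_OF.values())
    # The concept itself and all of its superclasses are excluded, because
    # every superclass would also be a correct answer.
    excluded = {cls, *ancestors(cls)}
    return {
        "stem": f"Which category does {cls} belong to?",
        "key": key,
        "distractors": sorted(all_classes - excluded),
    }
```

Excluding every superclass from the distractor pool reflects the subsumption reasoning that this question type is meant to assess: Disease is also a correct category for Asthma, so it cannot serve as a distractor.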

Another special form of terminology-based question is the rule-based question, which employs Semantic Web 
rules (Zoumpatianos et al 2011). An example of this kind of question is “What is a(n) person that holds a 
diploma issued by an engineering school?”. Beyond subsumption, rule-based questions involve inferential 
thinking, which corresponds to the level of understanding according to the Bloom taxonomy. 


2.2 Other Types of Questions: Analogy Exercises, and Multimedia Questions 


A special form of closed-type question found in the literature is analogy questions (Cubric and Tosic 2011; 
Alsubait et al 2014). These questions ask students to compare and identify analogies among concepts and 
individuals in different structures, for example “Instructor to University is as ........ to ........?”, for concept 


of knowledge at the level of understanding, although some authors consider that analogy questions belong to 
the level of analysis (Alsubait et al 2012). 

Apart from closed-type questions, such as multiple choice and fill-in-the-blank, assessment material in the 
form of problems and exercises has been automatically generated based on Semantic Web technologies. 
Although closed-type question methods are independent of the subject domain, the generation of problems 
and exercises depends on the domain, e.g. mathematics or programming. For example, Holohan et al (2006) use a 
specialized ontology that describes relational databases in order to produce exercise descriptions asking for 
the corresponding SQL query in given ontological mappings of databases. An example is “Show the 
teacherlevel, teamskill of each of the teachers whose teachername is ‘Edmond’ and teachercourse is 
‘databases’.” Williams (2011) uses properties (relations) with class membership axioms and certain 
refactoring and aggregation techniques in order to provide simple word problems that involve arithmetic 
operations, in the form “Benbecula and South Uist are islands that are members 
of the Uists and Barra Archipelago. Benbecula has a population of 1219. South 
Uist has a population of 1818. What is the ratio of the two populations?” Exercises and problems generated 
by both the above methods can assess simple problem solving, in the form of applying certain rules and 
procedures to familiar and unfamiliar tasks, so they relate to the application level of the revised Bloom 
taxonomy, assessing procedural rather than declarative knowledge. 
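
To make the word-problem example concrete, the following Python sketch composes the same ratio problem from two datatype property values and computes its answer. It is only an illustration under stated assumptions: it does not implement the refactoring and aggregation techniques of Williams (2011), and the POPULATION dictionary and the function ratio_problem are hypothetical. Only the population figures are taken from the quoted example.

```python
from fractions import Fraction

# Values taken from the example quoted above; the dictionary layout is assumed.
POPULATION = {"Benbecula": 1219, "South Uist": 1818}


def ratio_problem(a, b, group):
    """Compose a ratio word problem and its answer from two datatype values."""
    text = (
        f"{a} and {b} are islands that are members of {group}. "
        f"{a} has a population of {POPULATION[a]}. "
        f"{b} has a population of {POPULATION[b]}. "
        "What is the ratio of the two populations?"
    )
    # Fraction keeps the answer exact and reduces it to lowest terms.
    answer = Fraction(POPULATION[a], POPULATION[b])
    return text, answer
```

For the two islands above, the answer is the fraction 1219/1818, which is already in lowest terms and is approximately 0.67. A student has to carry out this computation, which is why such items are placed at the application level.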

Finally, there exist some approaches that generate image questions, such as hotspot questions with 
prompts such as “Identify Churchill in the picture” or “Identify the leader of a country in the picture.” 
(Papasalouros et al 2011). Again, this kind of question asks students to identify certain pieces of 
information based on visual clues and is considered to assess basic knowledge. 
3. CONCLUSIONS 


Using Semantic Web technologies for automatically generating assessment items has given some promising results, which 
are presented in this short article. Closed-type questions, as well as simple exercises, have been generated 
that assess not only basic knowledge but also understanding and simple problem solving skills. Some works 
have demonstrated the pedagogic potential of their approach by applying methods from educational 
measurement as well as by relying on certain theories of learning and assessment. Nevertheless, more work 
needs to be done so that this line of research may fulfill its pedagogic potential. 